Prion-like low complexity regions enable avid virus-host interactions during HIV-1 infection

Cellular proteins CPSF6, NUP153 and SEC24C play crucial roles in HIV-1 infection. While weak interactions of short phenylalanine-glycine (FG) containing peptides with isolated capsid hexamers have been characterized, how these cellular factors functionally engage with biologically relevant mature HIV-1 capsid lattices is unknown. Here we show that prion-like low complexity regions (LCRs) enable avid CPSF6, NUP153 and SEC24C binding to capsid lattices. Structural studies revealed that multivalent CPSF6 assembly is mediated by LCR-LCR interactions, which are templated by binding of CPSF6 FG peptides to a subset of hydrophobic capsid pockets positioned along adjoining hexamers. In infected cells, avid CPSF6 LCR-mediated binding to HIV-1 cores is essential for functional virus-host interactions. The investigational drug lenacapavir accesses unoccupied hydrophobic pockets in the complex to potently impair HIV-1 inside the nucleus without displacing the tightly bound cellular cofactor from virus cores. These results establish previously undescribed mechanisms of virus-host interactions and antiviral action.

In contrast from X-ray crystal structures, which show six CPSF6 FG peptides bound to six hydrophobic pockets in an isolated CA hexamer: This is incorrect. Both Price et al and Bhattacharya et al structures showed that not all of the six binding sites in the P212121 hexamer are occupied by the FG peptide.
TRIM5α-TRIM5α assemblies are mediated by both coiled-coil interactions and B-box 2 domain interactions, not just the coiled-coil.
Lack of line numbers made reviewing onerous.
Reviewer #2 (Remarks to the Author): Wei et al apply a combination of structural, biochemical and virological assays to characterize interactions between the HIV-1 capsid and cellular binding partners CPSF6, NUP153 and SEC24C. This is an interesting and important question. They conclude that interactions between low complexity regions within the host cell proteins are essential for binding and act by increasing the avidity of interaction with the capsid. They further conclude that the capsid binding drug lenacapavir binds to unoccupied binding pockets on capsid without the need to displace bound host proteins. In my opinion, the data convince that the properties of the regions flanking the FG peptide contribute to binding of CPSF6 to HIV-1 capsid lattices both in vitro and in cells. In my view the data are not strong enough to support the proposed structural mechanism.
I have the following major concerns. 1. A control should be included to show that flanking sequences containing typical levels of charged residues do not interfere with binding of the FG repeat region to the pocket in capsid. For example, LCR-FG-LCR and the sequences where LCR is replaced with "typical residues" should bind with similar affinity to isolated capsid hexamers where there is no avidity contribution, albeit with low µM-mM affinities.
2. The cryoEM experiments are performed with CA and CPSF6 concentrations above 100 µM. To understand expected behaviour in this concentration range it would be helpful to include a control titration with the different CPSF6 constructs, perhaps similar to that shown for one such construct in Extended figure 8. Why do the control constructs not bind the tubes at this high concentration given the presence of the FG sequence? If a peptide is added at this concentration, does it bind all six sites on the hexamer?
3. My main concerns relate to the cryoEM work:
-The authors claim that the structure allows modelling of the FG repeat. At the measured stoichiometry the FG repeat should be clearly visible within its binding pocket in two of six CA molecules in the cryoEM reconstruction at similar density to CA, but I cannot see it and would not be able to model it. Why not? If it is clearly visible, please include a figure clearly showing the FG repeat. This can be further validated by making sure that it matches the X-ray structure filtered to the same resolution.
-The resolution of the cryoEM structure is not expected to be the same at all radii. The weak additional density at high radius is presumably resolved at lower resolution than the strong capsid density. A plot of resolution at different radii within the helical reconstruction should be provided (for an example for TMV, see Fig 5G in Fromm et al, J. Struct Biol 2015). The high radius features should be filtered to the measured radial or local resolution. What is the resolution at which the LCR and GST regions are resolved? They should be filtered to this measured resolution prior to interpretation. The measured resolution will likely be substantially lower than that of the capsid domains, and I am very skeptical that it can be detailed enough to allow modelling of the LCR.
-The authors must carefully rule out the possibility that the density interpreted as the LCR is helically symmetrized noise. CA tubes adopt mixtures of different helical parameters but only one structure is presented here for each construct. It should also be possible to reconstruct other tube families from the same dataset with different helical parameters. If the authors' interpretation is correct, the LCR density should be the same in all reconstructions. Repeating the data collection and reconstruction with a redetermination of helical symmetry would also be a useful validation.
-If the above validations can be performed, it is then also necessary to validate that the model interpreted from the density is correct. To me the link between Fig 4 d and e is tenuous -what other possible arrangements of the protein could be accommodated in the density?
The manuscript entitled, 'Prion-like low complexity regions enable avid virus-host interactions during HIV-1 infection,' by Wei et al examines the structural basis of interactions between CPSF6 and the HIV capsid. This is of significance given the interactions between capsid and cellular proteins such as CPSF6, NUP153 and SEC24C. All of the proteins are similar in that they contain a central phenylalanine/glycine rich domain that has been described to interact with the capsid. These interactions are critical to the viral life cycle as the cellular proteins are involved in shuttling of the capsid through the cytoplasm and into the nucleus. The authors set out to describe the interactions of these proteins (with a focus on CPSF6) with the complete and organized capsid to identify why the previously described interaction of the FG regions was of relatively low affinity. They accomplish this through a series of biochemical, virologic and structural studies that define the importance of prion-like low complexity domains that surround the FG region. These LCRs are of interest as they are uncharged regions of amino acids that can take on the shape necessary to interact with a target molecule, thus driving a templated interaction. Through application of their various techniques and the evaluation of mutant CPSF6 constructs the authors show convincing evidence that LCRs are critical to CPSF6 functional interaction with capsid, particularly the N-terminal LCR. This manuscript expands our understanding of how CPSF6 interacts with the capsid lattice and likely gives us clues to similar interactions with the nuclear pore. Analysis of the effect of the novel antiviral LEN reveals a possible mechanism of LEN activity (stabilization of the core). Additionally, the nature of CPSF6 binding (to areas between hexamers) lends support to the idea that HIV cores present in the nucleus are still structurally intact. It is well written and adds to the field.
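I do have the following concerns:
-Each of the critical initial pieces of data (Table 1, Fig 1A-C, and Fig 1D-F) are all performed using different substrates and different assays. The table shows binding affinity with hexamers (that are shown in 1G to have weaker binding in general), panels A-C use nanotubes and D-F use intact HIV cores. It is impossible to make necessary comparisons between these (ie is LCR needed compared to FG alone). Although mostly covered in later assays this makes the critical first figure a little weak.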
- Fig 1A-C -There appears to be a size shift in CPSF6 in the bound form. Is this seen consistently? If real, this could indicate a modification that is associated with binding (given this is lysate, a possibility) -do the structural studies suggest unmodified CPSF6? Additionally, the nonspecific band at ~55 kDa in panel B appears to be depleted by the presence of CA, but is not found in the CA bound fraction. What is this band and where did it go?
-In general, it would be best to have the technique used to generate the data specifically named in the figure legend or the text. The methods section is complete and I could find everything, but it was often hard to determine which assay was actually used (ie 1D-F).
- Figure 2 -The PLA assay shows convincing co-localization of CPSF6 and CA in the nucleus. CPSF6 is controlled for in the supplementary materials. CA is not. Subsequent data (Fig 3) suggest an accumulation of CA at the periphery of the nucleus. This could be indicative of a deficit in entry through the nuclear pore. Although the overall findings suggest a difference in interaction, the data presented here could be a loss of localization due to a decreased presence of CA in the nucleus. This is an important distinction as it would help to identify the effect of interactions in the cytoplasm versus the nucleus. Comparisons used for the t-test should be explicitly stated.
- Figure 3 -The microscopy suggests the accumulation of the PIC at the periphery of the nucleus. This could be improper positioning after transit through the pore, as would be supported by the increase in LAD associated integrants, but some of the foci appear to be outside of the nucleus. Where are these complexes located? Does CPSF6 alteration change the rate of nuclear import? A time course to establish overall kinetics of movement into the nucleus and association with nuclear speckles would address this and further define the importance of CPSF6 in the process. This becomes more critical in light of later discussions regarding the stability of the core in the nucleus (and how it is affected by LEN). A time course would establish that the process has been slowed as opposed to completed early.
- Figure 3D-F -How many total integrants were sequenced per sample per cell? The change in localization in 3A suggests that there may be less integration overall. Is 3E a percentage of integrants near a gene dense region? or the average number of genes within 1Mb?
-Extended data 3B -Co-localization should be quantified. C6/NE appears to be showing the same pattern as the LCR containing mutants.
- Figure 5 -Panel A suggests that association with the core still occurs without the critical anchoring domain. This seems contradictory to much of the findings.
- Figure 6 -This is an excellent set of experiments. It might be helpful to have a schematic similar to 6A placed earlier in the manuscript so that the reader can visually examine the various LCRs used.
- Figure 7 -What are the baseline differences in infection of each of the different cell types?
-The authors conclude that LEN either displaces CPSF6 or can interact with the CPSF6/CA complex. The final explanation given would seem to be a third option: that LEN interacts with a different area of the CA not occupied by CPSF6.
-Do your structural models allow you to predict how many hexamers on the core (not a nanotube) are occupied by CPSF6? Do you think the irregular shape of the capsid influences this?
-The conclusion that CPSF6:CPSF6 interactions are templated by binding of the FG to CA is likely true, but not conclusively proven by this data alone.

Response to Referees Letter
We have found the reviewers' comments constructive and revised the manuscript accordingly. Please note the following. i) The revised text is in red. Table 3. iii) For clarity and to include protein sequences where needed, as requested by reviewer 3, we have reorganized previously presented data by expanding total numbers of main and extended data figures. iv) We

Below is our point-by-point response to reviewers' comments.

If this assumption is correct, sub-stoichiometric binding of the CPSF6 FG peptide to a CAhex would not be surprising given very low binding affinity for these interactions. We also note that PDB: 4U0C (P6 space group) and PDB: 4U0D (P212121 space group) show 6 and 3 NUP153 FG peptides bound to a CAhex, respectively. Again, low affinity binding of FG peptides could be the primary reason for sub-stoichiometric interactions of the NUP153 FG peptide + CAhex observed in certain crystal forms. To avoid any confusion with respect to the particular sentence cited above by the reviewer, we removed the reference to the previous studies and instead, we now compare our X-ray and cryo-EM results (see page 15): "In contrast to our X-ray crystal structure (Extended Data Fig 12a), which shows six CPSF6 FG peptides bound to six hydrophobic pockets in an isolated CA hexamer, our cryo-EM studies reveal the differential stoichiometry of CPSF6(LCR-FG-LCR) molecules binding to only two out of six cognate sites in the context of extended CA lattices (Fig 5g)."

Comment: TRIM5α-TRIM5α assemblies are mediated by both coiled-coil interactions and B-box 2 domain interactions, not just the coiled-coil
Response: Agreed and corrected. The revised sentence on page 17 reads: "TRIM5α-TRIM5α assemblies are mediated by B-box 2 and coiled-coil interactions resulting in a TRIM5α hexagonal cage surrounding the hexameric CA lattices"

Comment: Lack of line numbers made reviewing onerous.
Response: We apologize. Line numbers are now included.

Response: Images of our CPSF6-decorated CA tubes show that CPSF6 binding is not uniform, but instead seems to happen in patches. This is consistent with what is seen for many (if not most) proteins that decorate helical tubes. The combined effect of this substoichiometric CPSF6 binding and the limited resolution of our cryo-EM helical maps makes FG peptide density hard to detect. We used a pseudo-single particle approach to analyze our data and, in the resulting map, we can see an indication of FG peptide density in the expected positions. However, due to the substoichiometric CPSF6 binding, FG peptide density is weaker than CA density and is apparent only at relatively low thresholds. We felt that the quality of that pseudo-single particle map was not sufficiently high to merit inclusion in the manuscript. In contrast, displaying the symmetrized helical map at an appropriate threshold shows clearly how CPSF6 density contacts CA hexamers and this is shown in the new Fig. 5g. We know from results of published X-ray crystallography, biochemistry and virology studies that the FG peptide directly engages with the CA hydrophobic pocket. In agreement with these well-established results (which are confirmed by our own X-ray crystallography, biochemistry and virology results presented in our manuscript), our cryo-EM studies show that CPSF6 density extends toward specific CA hexamer hydrophobic pockets, whereas the map calculated from GST-CPSF6 (ΔFG) does not show any additional density around CA hexamers. Furthermore, our cryo-EM observation that CPSF6 makes two contacts on neighboring CA hexamers is consistent with our biochemical data indicating a 2:6 CPSF6:CA interaction stoichiometry.
We think that the convergence of several independent lines of experimental evidence strongly supports the conclusion that the GST-CPSF6 binding to CA tubes that we see in our cryo-EM maps is due to direct interactions of CPSF6 FG peptide to cognate CA hydrophobic pockets. The structural details of FG motif binding to CA hexamers have been well-established. The principal contribution from our study is uncovering the mechanism for avid binding of CPSF6 (and likely SEC24C and NUP153) to a mature capsid lattice, and our cryo-EM map provides critical information by demonstrating multivalent CPSF6 LCR assembly along adjoining capsid hexamers. For context, published cryo-EM maps of TRIM-5α bound to CA tubes have a comparably limited resolution. Yet, large hexagonal lattices of polyvalent TRIM-5α assemblies on top of the mature capsid lattice can be clearly seen. Those maps could not delineate how the short TRIM-5α peptide (the SPRY-motif) binds to CA hexamers and, since there is no high-resolution crystal structure of the SPRY-motif bound to CA hexamers, the structural details of TRIM-5α binding to CA remain unknown. Yet, cryo-EM studies of CA-TRIM-5α interaction have been instrumental for understanding the mechanism (validated by virology assays) behind avid binding of TRIM-5α to mature capsid.

Comment: A control should be included to show that flanking sequences containing typical levels of charged residues do not interfere with binding of the FG repeat region to the pocket in capsid. For example, LCR-FG-LCR and the sequences where LCR is replaced with "typical residues" should bind with similar affinity to isolated capsid hexamers where there is no avidity contribution, albeit with low µM-mM affinities.
Comment: -The resolution of the cryoEM structure is not expected to be the same at all radii. The weak additional density at high radius is presumably resolved at lower resolution than the strong capsid density. A plot of resolution at different radii within the helical reconstruction should be provided (for an example for TMV, see Fig 5G in Fromm et al, J. Struct Biol 2015). The high radius features should be filtered to the measured radial or local resolution. What is the resolution at which the LCR and GST regions are resolved? They should be filtered to this measured resolution prior to interpretation.
Response: We thank the reviewer for making this excellent point. In the original version of the manuscript we low-pass filtered our cryo-EM maps to facilitate interpretation of poorly ordered density, but the reviewer's suggestion to filter the maps according to local resolution is clearly a better approach than applying a uniform low-pass filter. We have calculated local resolution values and filtered the maps accordingly (see Extended Data 9).

Comment: The measured resolution will likely be substantially lower than that of the capsid domains, I am very skeptical that it can be detailed enough to allow modelling of the LCR.
Response: We apologize for this unintentionally misleading general statement, and for not explaining the purpose and outcomes of our initial modeling efforts. The reviewer is absolutely correct that our cryo-EM results are not at the resolution needed to provide specific details for LCR conformations. Instead, our initial efforts focused on making sure that our interpretation of GST-CPSF6 density was correct. We have made the following observations: 1) The available X-ray structure of GST fits well within the respective cryo-EM density; 2) The N-terminal LCR chain could be readily placed within the CPSF6 density. Furthermore, the N-terminal LCR accounted for the bulk of CPSF6 density; 3) The length of the N-terminal LCR allowed it to readily extend to the CPSF6:CA contact points (new Fig 5g), thereby enabling the FG peptide binding to cognate hydrophobic CA pockets as seen by X-ray crystallography; 4) The close proximity of LCR chains from different CPSF6 molecules suggested extensive LCR-LCR interactions. Collectively, these spatial constraints together with corroborating findings from HDX experiments (protections in LCRs), biochemical assays (2:6 CPSF6:CA binding stoichiometry), and extensive virology assays (indicating the biological significance of CPSF6 LCRs and more specifically of the N-terminal LCR), allowed us to propose the LCR-LCR mediated mechanism for polyvalent CPSF6 assembly onto mature CA lattice. However, as we have acknowledged above, the limited resolution of cryo-EM maps did not allow us to delineate different LCR conformations or specific details of LCR-LCR interactions.
To make this absolutely clear, we now offer a more conservative, diagrammatic representation of CPSF6-CPSF6 interactions (see Extended Data Fig 11b) that is consistent with the limited resolution of our cryo-EM maps. Furthermore, we have accordingly revised the cryo-EM sections of the text on pages 8 and 9.
Importantly, we have now extended our efforts to perform all atom (AA) molecular dynamic ( i) The N-terminal LCRs primarily contribute to the CPSF6-CPSF6 assembly forming a network of highly interacting CPSF6 chains templated by the CA lattice; ii) N-terminal LCRs adopt multiple conformations, which allow effective assembly of the highly interactive network of CPSF6 chains.
Also note that the AA MD simulation results regarding conformational flexibility of LCR-LCR interactions are consistent with our experimental results showing that native CPSF6 LCR can effectively be replaced by other prion-like LCRs with completely different primary structures from unrelated proteins. These findings are consistent with what is generally known about prion-like LCR-LCR interactions. When prion-like LCRs are brought in close proximity (in the case of CPSF6, by its FG motif binding to adjacently positioned CA hexamers), they form a highly interactive network of hydrophobic chains, which typically adopt multiple conformations. Taken together, our AA MD simulations further support our extensive virology, biochemistry, cryo-EM and HDX-MS results, which collectively reveal the previously undescribed mechanism of prion-like LCR mediated high affinity binding of CPSF6 to CA lattices.
Comment: -The authors must carefully rule out the possibility that the density interpreted as the LCR is helically symmetrized noise. CA tubes adopt mixtures of different helical parameters but only one structure is presented here for each construct. It should also be possible to reconstruct other tube families from the same dataset with different helical parameters. If the authors' interpretation is correct, the LCR density should be the same in all reconstructions. Repeating the data collection and reconstruction with a redetermination of helical symmetry would also be a useful validation.
Response: Another excellent point. We have been well aware of the risk of misinterpreting artifactual density resulting from unidentified problems in helical processing and had calculated a second CPSF6-CA map from a different subset of helical images to verify that what we considered CPSF6-related density was real. As the reviewer suggested, we now include Extended Data Fig 11a showing the excellent correspondence (accounting for differences in helical symmetry) of CPSF6 density in both independently calculated helical maps.

Comment: -If the above validations can be performed, it is then also necessary to validate that the model interpreted from the density is correct. To me the link between Fig 4 d and e is tenuous -what other possible arrangements of the protein could be accommodated in the density?
Response: Also a very good point. Although its resolution is limited, the cryo-EM map of CPSF6-CA tubes uniquely determines the "topology" of CPSF6 interaction with the CA hexamer lattice by identifying the position of GST tags, the points of LCR interaction with the CA lattice and the position of LCR-related density. This structural information, along with the length of LCR domains and information from complementary biochemical and HDX results, was considered to put forward the diagrammatic representation of LCR interactions presented in a revised Extended Data Fig. 11b.
In conclusion, we would like to make the following general point. Structural studies with disordered FG containing prion-like LCRs are notoriously difficult. Therefore, it is remarkable that we have been able to account for the CPSF6 LCR density in our cryo-EM maps and detect protection in LCR by HDX-MS. Despite limited resolution of our cryo-EM maps (as expected from conformationally flexible LCR-LCR interactions), our principal structural findings -- i) CPSF6 molecules contact two adjoining CA hexamers (Fig 5g), and ii) LCR-LCR interactions populate zig-zagging density extending between the rows of adjoining hexamers (Fig 5h) -- strongly support our main discovery of the previously undescribed mechanism of prion-like LCR mediated, avid, polyvalent binding of FG motif containing CPSF6 (and likely also NUP153 and SEC24C) to mature CA lattices.

REVIEWER #3.
Comment: -Each of the critical initial pieces of data (Table 1, Fig 1A-C, and Fig 1D-F) are all performed using different substrates and different assays. The table shows binding affinity with hexamers (that are shown in 1G to have weaker binding in general), panels A-C use nanotubes and D-F use intact HIV cores. It is impossible to make necessary comparisons between these (ie is LCR needed compared to FG alone). Although mostly covered in later assays this makes the critical first figure a little weak.
Comment: Fig 1A-C -There appears to be a size shift in CPSF6 in the bound form. Is this seen consistently? If real, this could indicate a modification that is associated with binding (given this is lysate a possibility) -do the structural studies suggest unmodified CPSF6?
Response: The apparent differences in migration patterns in the previous image, which were correctly noticed by the reviewer, were due to different NaCl content in the tested samples: cell lysates and pulled-down fractions contained low and very high NaCl, respectively. We have now adjusted NaCl content to be very similar in all fractions, which revealed very similar migration patterns for CPSF6 in cell lysates and pull-down fractions (see New Figure 1a).